(54) Power management method

(57) In a method embodiment, there are steps (160) to operate a microprocessor having at least one counter
located on the microprocessor. The method operates functional circuitry on the microprocessor over a
plurality of clock cycles. This operation causes an on-chip activity to occur at least once during the
plurality of clock cycles. The method also advances a count in the at least one counter in response to each
incidence of the on-chip activity. After this advancement of the count, the method predicts (164) the
busyness of the microprocessor in response to the count in the counter. Finally, the method selectively
adjusts power consumption (168) of the microprocessor in response to a comparison of the predicted busyness
with a threshold.
Description

[0001] The present embodiments relate to power management in computers, and are more particularly directed to
microprocessor circuits, systems, and methods for adjusting microprocessor power consumption in response to
on-chip activity.

[0002] As computer systems advance in development, various techniques are evolving to produce more
power-efficient machines. For example, in the instance of portable computers such as laptops and notebook
computers, it is desirable to improve power efficiency so that the rechargeable power supply lasts a greater
amount of time between recharge periods. Therefore, various techniques have arisen to reduce power consumption
in these types of computers, particularly during periods of reduced activity or non-use, such as when the user
has not operated the keyboard for a particular amount of time. Power consumption techniques also arise in the
context of desktop computers. For example, many users prefer to leave their computers turned on during lengthy
periods of non-use, and even overnight, for purposes of serving other computers, convenience, receiving
facsimile transmissions, or simply to avoid a lengthy boot-up procedure upon returning to the computer the next
day. During these periods of non-use, power reduction is also beneficial. Lastly, the combination of portable
computers and desktop computers in so-called docking bay configurations also benefits from power adjustments,
particularly in instances where the portable computer is removed from the dock, thereby changing the power
consumption considerations. These varying configurations each may benefit from improved power performance.

[0003] One current approach to power reduction in computer systems is directed to operations of devices
peripheral to the central processing unit ("CPU") of the system. In these systems, often power is reduced or
eliminated to one or more of the peripherals during periods of detected non-use of the peripheral. For example,
under the current WINDOWS 95 operating system, a user may input a time period in connection with the computer
monitor, and when the monitor remains idle (i.e., the display it depicts remains unchanged) for that period of
time, one or more control signals are issued to the monitor so that it enters a reduced power state.
Thereafter, when activity with respect to the monitor commences (i.e., an action is taken which should be
manifested by a change in the image displayed by the monitor), then again one or more control signals are
issued to the monitor, but here to return the monitor to its fully operational state. Note further that the
monitor is merely used by way of example, while the reduction of power to other peripherals is known in the
art.

[0004] Another current approach to power reduction in computer systems is directed to operations of the actual
CPU of the system. Under this approach, again the activity of peripherals, or lack of such activity, is used to
predict system burden and control power considerations. In these systems, power consumption of the CPU is
reduced rather than reducing power to the peripheral(s). Typically, such a reduction occurs by identifying a
manifestation of low peripheral activity and, in response, predicting that less than full CPU activity is
required. Based on this prediction, power is reduced to the CPU by reducing its clock frequency, such as
through the use of an external clock circuit providing the clock signal to the CPU.

[0005] Note that the prior art approaches described above suffer from various drawbacks. For example, systems
which reduce power only to system peripherals may be ignoring one of the more major power consumers, namely,
the CPU. In other words, the CPU may be one of the major, if not the largest, consumers of power in the system.
Moreover, those power savings systems which are directed to the CPU by reducing its power also suffer
drawbacks. For example, if the prediction of inactivity is too aggressive or is improper, then the CPU
functionality may be reduced at a time when higher CPU operability is desired or necessary. Indeed, this may
lead to software compatibility problems because some software may fail if it does not receive enough computing
time. Thus, one skilled in the art will appreciate that reducing the CPU clock speed to consequently reduce
power consumption during such a time may result in such a software failure. As another drawback, note that
improper or inefficient power reduction may cause user frustration. In response, in systems where power
controls are user-alterable, the user may disable the power consumption feature in its entirety. This action
therefore actually increases net overall power consumption by disabling the feature which could otherwise
reduce power consumption during at least some periods of CPU inactivity.

[0006] In view of the foregoing, there arises a need to improve upon the prior art and provide a system for more 
accurately reducing computer system power consumption while reducing the drawbacks set forth above. 
[0007] An illustrative embodiment of the present invention seeks to provide a method for operating a microprocessor 
that avoids or minimizes above-mentioned problems. 

[0008] Aspects of the invention are specified in the claims. In carrying out principles of the present
invention, a method includes steps to operate a microprocessor having at least one counter located on the
microprocessor. The method operates functional circuitry on the microprocessor over a plurality of clock
cycles. This operation causes an on-chip activity to occur at least once during the plurality of clock cycles.
The method also advances a count in the at least one counter in response to each incidence of the on-chip
activity during the plurality of clock cycles. After this advancement of the count, the method predicts the
busyness of the microprocessor in response to the count in the counter. Finally, the method selectively adjusts
power consumption of the microprocessor in response to a comparison of the predicted busyness with a threshold.
Other circuits, systems, and methods are also disclosed and claimed.
For a better understanding of the present invention, reference will now be made, by way of example, to the
accompanying drawings, in which:

Figure 1 illustrates an electrical block diagram of a microprocessor in which the preferred embodiment may be
implemented;

Figure 2 illustrates an electrical block and functional diagram of clock generation and control circuitry 120
of Figure 1; and

Figure 3 illustrates a flow chart of the preferred method steps of operation of clock generation and control
circuitry 120 of Figure 2.

[0009] Referring to Figure 1, an exemplary data processing system 102, including an exemplary superscalar
pipelined microprocessor 110 within which the preferred embodiment is implemented, is described. The following
discussion first overviews system 102 and, given the understanding derived from that overview, then addresses
the context of improving power consumption within system 102 by reducing power consumption of microprocessor
110. Note also that it is to be understood that the architecture of system 102 and of microprocessor 110 is
described herein by way of example only, as it is contemplated that the present embodiments may be utilized in
microprocessors of various architectures. It is therefore contemplated that one of ordinary skill in the art,
having reference to this specification, will be readily able to implement the present embodiments in such other
microprocessor architectures. It is further contemplated that the present embodiments may be realized in
single-chip microprocessors with the manufacture of such integrated circuits accomplished according to silicon
substrate, silicon-on-insulator, gallium arsenide, and other manufacturing technologies, and using MOS, CMOS,
bipolar, BiCMOS, or other device implementations.

[0010] Microprocessor 110, as shown in Figure 1, is connected to other system devices by way of bus B. While
bus B, in this example, is shown as a single bus, it is of course contemplated that bus B may represent
multiple buses having different speeds and protocols, as is known in conventional computers utilizing the PCI
local bus architecture; thus, bus B is illustrated as a single bus here merely by way of example and for its
simplicity. System 102 contains such conventional subsystems as communication ports 103 (including modem ports
and modems, network interfaces, and the like), graphics display system 104 (including video memory, video
processors, a graphics monitor), main memory system 105 which is typically implemented by way of dynamic random
access memory (DRAM) and includes a stack 107, input devices 106 (including keyboard, a pointing device, and
the interface circuitry therefor), and disk system 108 (which may include hard disk drives, floppy disk drives,
and CD-ROM drives). It is therefore contemplated that system 102 of Figure 1 corresponds to a conventional
desktop computer or workstation, as are now common in the art. Of course, other system implementations of
microprocessor 110 (e.g., portable-type computing devices) can also benefit from the present embodiments, as
will be recognized by those of ordinary skill in the art.
[0011] Microprocessor 110 includes a bus interface unit ("BIU") 112 that is connected to bus B, and which
controls and accomplishes communication between microprocessor 110 and the other elements in system 102. BIU
112 includes the appropriate control and clock circuitry to perform this function, including write buffers for
improving throughput and including timing circuitry so as to synchronize the results of internal microprocessor
operation with bus B timing constraints. Microprocessor 110 also includes clock generation and control
circuitry 120 which, in this exemplary microprocessor 110, generates internal clock phases; the frequency of
the internal clock phases, in this example, may be selectably programmed as a multiple of the frequency of the
input clock. Additionally, as detailed later in connection with Figures 2 and 3, clock generation and control
circuitry 120 in the preferred embodiment is operable to alter the system clock frequency for microprocessor
110 to reduce power consumption during periods of perceived low microprocessor on-chip activity.
[0012] As is evident in Figure 1, microprocessor 110 has three levels of internal cache memory, with the
highest of these as level 2 cache 114, which is connected to BIU 112. In this example, level 2 cache 114 is a
unified cache, and is configured to receive all cacheable data and cacheable instructions from bus B via BIU
112, such that much of the bus traffic presented by microprocessor 110 is accomplished via level 2 cache 114.
Of course, microprocessor 110 may also accomplish bus traffic around level 2 cache 114 by treating certain bus
reads and writes as "not cacheable". Level 2 cache 114, as shown in Figure 1, is connected to two level 1
caches 116; level 1 data cache 116_d is dedicated to data, while level 1 instruction cache 116_i is dedicated
to instructions. Power consumption by microprocessor 110 is minimized by accessing level 2 cache 114 only in
the event of cache misses of the appropriate one of the level 1 caches 116. Furthermore, on the data side,
microcache 118 is provided as a level 0 cache, which in this example is a fully dual-ported cache.

[0013] As shown in Figure 1 and as noted above, microprocessor 110 is of the superscalar type. In this example
multiple execution units are provided within microprocessor 110, allowing up to four instructions to be
simultaneously executed in parallel for a single instruction pointer entry. These execution units include two
ALUs 142_0, 142_1 for processing conditional branch, integer, and logical operations, floating-point unit (FPU)
130, two load-store units 140_0, 140_1, and microsequencer 148. The two load-store units 140 utilize the two
ports to microcache 118, for true parallel access thereto, and also perform load and store operations to
registers in register file 139, as well as to the level 1 caches 116_d and 116_i. Data microtranslation
lookaside buffer (µTLB) 138 is provided to translate logical data addresses into physical addresses, in the
conventional manner.

[0014] The multiple execution units are controlled by way of multiple pipelines with seven stages each, with write 
back, which in some architectures may be thought of as an eighth stage which also may be referred to as instruction 
graduation. The pipeline stages are as follows: 



F     Fetch: This stage generates the instruction address and reads the instruction from the instruction
      cache or memory
PD0   Predecode stage 0: This stage determines the length and starting position of up to three fetched
      x86-type instructions
PD1   Predecode stage 1: This stage extracts the x86 instruction bytes and recodes them into fixed length
      format for decode
DC    Decode: This stage translates the x86 instructions into atomic operations (AOps)
SC    Schedule: This stage assigns up to four AOps to the appropriate execution units
OP    Operand: This stage retrieves the register and/or memory operands indicated by the AOps
EX    Execute: This stage runs the execution units according to the AOps and the retrieved operands
WB    Write back: This stage stores the results of the execution in registers or in memory



[0015] Referring back to Figure 1, the pipeline stages noted above are performed by various functional
circuitry blocks within microprocessor 110. Fetch unit 126 generates instruction addresses from the instruction
pointer, by way of instruction micro-translation lookaside buffer (µTLB) 122, which translates the logical
instruction address to a physical address in the conventional way, for application to level 1 instruction cache
116_i. Instruction cache 116_i produces a stream of instruction data to fetch unit 126, which in turn provides
the instruction code to the predecode stages in the desired sequence.

[0016] Predecoding of the instructions is broken into two parts in microprocessor 110, namely, predecode 0
stage 128 and predecode 1 stage 132. These two stages operate as separate pipeline stages, and together operate
to locate up to three x86 instructions and apply the same to decoder 134. As such, the predecode stage of the
pipeline in microprocessor 110 is three instructions wide. Predecode 0 unit 128, as noted above, determines the
size and position of as many as three x86 instructions (which, of course, are variable length), and as such
consists of three instruction recognizers; predecode 1 unit 132 recodes the multi-byte instructions into a
fixed-length format, to facilitate decoding.
[0017] Decode unit 134, in this example, contains four instruction decoders, each capable of receiving a fixed
length x86 instruction from predecode 1 unit 132 and producing from one to three atomic operations (AOps); AOps
are substantially equivalent to RISC instructions. Three of the four decoders operate in parallel, placing up
to nine AOps into the decode queue at the output of decode unit 134 to await scheduling; the fourth decoder is
reserved for special cases. Scheduler 136 reads up to four AOps from the decode queue at the output of decode
unit 134, and assigns these AOps to the appropriate execution units. In addition, the operand unit 144 receives
and prepares the operands for execution. As indicated in Figure 1, operand unit 144 receives an input from
scheduler 136 and also from microcode ROM 146, via multiplexer 145, and fetches register operands, and/or
memory operands via load/store units 140_0 and/or 140_1, for use in the execution of the instructions. In
addition, according to this example, operand unit 144 performs operand forwarding to send results to registers
that are ready to be stored, and also performs address generation for AOps of the load and store type.

[0018] Microsequencer 148, in combination with microcode ROM 146, controls ALUs 142 and load/store units 140
in the execution of microcode entry AOps, which are generally the last AOps to execute in a cycle. In this
example, microsequencer 148 sequences through microinstructions stored in microcode ROM 146 to accomplish this
control for those microcoded microinstructions. Examples of microcoded microinstructions include, for
microprocessor 110, complex or rarely-used x86 instructions, x86 instructions that modify segment or control
registers, handling of exceptions and interrupts, and multicycle instructions (such as REP instructions, and
instructions that PUSH and POP all registers).
[0019] Microprocessor 110 also includes circuitry 124 for controlling the operation of JTAG scan testing, and
of certain built-in self-test functions, ensuring the validity of the operation of microprocessor 110 upon
completion of manufacturing, and upon resets and other events.

[0020] Figure 2 illustrates a block and functional diagram of clock generation and control circuitry 120 from
Figure 1 in greater detail. In the preferred embodiment, clock generation and control circuitry 120 includes a
plurality of counters which, in Figure 2, are generally indicated by the reference numeral 150, and where the
identifier for each different counter is combined with a subscript so as to distinguish it from other ones of
the counters. Thus, Figure 2 specifically illustrates counters 150_0 through 150_N, and as demonstrated below
N may be any integer number so as to accomplish the preferred functionality of those counters. Each of counters
150_0 through 150_N is preferably a saturatable binary counter, meaning it may be reset to an initial value and
then advance on a binary scale from the initial value toward some limit, where reaching that limit effectively
"saturates" the counter. In other words, once the limit is reached, the count no longer advances until the
counter is either reset or advanced in an opposite direction away from the limit, both of which are discussed
later. Also discussed later is the manner of determining the specific limit for each of counters 150_0 through
150_N. Note also that it is stated that the count for each of the counters advances from the reset value,
meaning the count functionality may be achieved by the count from a relatively low number at reset toward a
larger number, or vice versa. For simplification purposes, from this point forward the counting function is
often discussed in the context of a reset to a zero count, and then an advancement from that zero value to a
larger number which represents the saturating limit for the counter. Lastly, note that counters 150_0 through
150_N may be constructed as standard digital registers, and are preferably located in the machine specific
register ("MSR") space of microprocessor 110.
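
The saturating behaviour described above can be illustrated with a minimal software sketch. It is not the
register implementation of counters 150; the class name, the `limit` and `reset_value` parameters, and the
choice that a retreat never goes below the reset value are assumptions made for illustration only.

```python
class SaturatingCounter:
    """Counter that stops advancing once it reaches a fixed limit."""

    def __init__(self, limit, reset_value=0):
        # `limit` and `reset_value` are illustrative parameters, not values from the text.
        if limit < reset_value:
            raise ValueError("limit must not be below the reset value")
        self.limit = limit
        self.reset_value = reset_value
        self.count = reset_value

    def reset(self):
        """Return the count to its initial value."""
        self.count = self.reset_value

    def advance(self):
        """Move one step toward the limit; do nothing once saturated."""
        if self.count < self.limit:
            self.count += 1

    def retreat(self):
        """Move one step away from the limit (assumed never to pass the reset value)."""
        if self.count > self.reset_value:
            self.count -= 1

    @property
    def saturated(self):
        """True while the count sits at the limit."""
        return self.count == self.limit


counter = SaturatingCounter(limit=3)
for _ in range(5):
    counter.advance()
print(counter.count, counter.saturated)  # 3 True
```

Five advances against a limit of three leave the count at three, which is the "saturated" condition; only a
reset or a step in the opposite direction changes the count from there.
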
[0021] Each of counters 150_0 through 150_N produces a corresponding count designated as COUNT0 through
COUNTN in Figure 2. The count values are readable by a count evaluation function block 152. In other words,
since the counters are part of the MSR space, then those values are readily available for further processing
operations based on those values. In this regard, in the preferred embodiment count evaluation function block
152 is actually performed by software programming of microprocessor 110 and, therefore, the block is shown with
dashed lines in Figure 2. In other words, the functionality of block 152, as detailed later, is accomplished by
executing software and such execution is accomplished using the components of microprocessor 110 as shown in
Figure 1. In an alternative embodiment, block 152 may be achieved by including additional hardware within
microprocessor 110. In any event, and as detailed below in connection with Figure 3, the functionality of block
152 evaluates one or more of the count values COUNT0 through COUNTN and based on that evaluation provides an
adjustment signal ADJ to a clock speed generator 154. Moreover, in an alternative implementation also discussed
below, block 152 may further receive and evaluate counter values from counters external from microprocessor 110
in addition to those (or a subset of those) located on microprocessor 110.
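
As a rough illustration of a software count evaluation, the sketch below turns a set of count values into an
ADJ decision. The weighted-sum prediction, the two thresholds, and the string values used for ADJ are all
assumptions introduced here; this portion of the text states only that the counts are evaluated and that an
adjustment signal results.

```python
def evaluate_counts(counts, weights, low_threshold, high_threshold):
    """Return an assumed ADJ decision: "decrease", "maintain" or "increase"."""
    if len(counts) != len(weights):
        raise ValueError("one weight is needed per count")
    # Assumed prediction rule: busyness is a weighted sum of COUNT0..COUNTN.
    busyness = sum(w * c for w, c in zip(weights, counts))
    if busyness < low_threshold:
        return "decrease"   # predicted idle: slow the clock to save power
    if busyness > high_threshold:
        return "increase"   # predicted busy: restore clock speed
    return "maintain"


# Three hypothetical counts (instructions, interrupts, context switches).
print(evaluate_counts([10, 0, 1], [1, 4, 2], low_threshold=20, high_threshold=100))  # decrease
```

Here the weighted sum is 12, which falls below the lower threshold of 20, so the sketch requests a decrease in
clock speed.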

[0022] Clock speed generator 154 may be constructed according to principles known in the art, and operates to
output the clock signal (abbreviated as CLK on Figure 2) to the various clocked circuits of microprocessor 110.
In other words, generator 154 provides what is often referred to as a system clock signal. Thus, the speed of
the CLK signal represents the general speed of operation of microprocessor 110. In the preferred embodiment,
the speed of the CLK signal is alterable in response to the ADJ signal from count evaluation function block
152. Thus, for reasons appreciated later, the level of ADJ (which could be either analog or digital) operates
to either maintain, increase, or decrease the speed of the CLK signal, thereby affecting the overall speed of
operation of microprocessor 110. Note that the preceding discussion mentions the "speed" of operation rather
than its frequency. In particular, it is contemplated within the present inventive scope that clock speed
generator 154 be constructed in whatever fashion desirable to give rise to the ability to adjust microprocessor
operation speed and, hence, to consequently adjust the power consumption of the microprocessor in response to
the change in operational speed. For example, one of at least two techniques may be implemented to adjust
microprocessor speed. As a first technique, the duty cycle of a general digital signal may be altered to
provide the CLK signal and thereby adjust the speed of microprocessor operation. In this approach, a digital
periodic signal is provided, but various transitions of that signal are gated off so that only the remaining
transitions are presented by the system CLK signal. Thus, while the frequency of the general digital signal
remains unchanged, the actual CLK signal used to operate the microprocessor circuitry provides fewer
transitions because other transitions are gated off. In such an approach, however, there may be additional
considerations required when removing the gate and allowing the CLK signal to increase in speed due to some
instabilities in the first few transitions following gate removal. As a second technique, the actual frequency
of the CLK signal may be reduced, although this technique may require more complex circuitry than the first
technique described immediately above.
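
The first technique can be pictured with a short sketch that gates off transitions of an unchanged periodic
signal. The function name and the `keep` / `out_of` parameters are hypothetical, and the regular "first `keep`
of every `out_of`" pattern is only one assumed gating scheme among many possible ones.

```python
def gate_clock(transitions, keep, out_of):
    """Pass the first `keep` of every `out_of` transitions and gate off the rest."""
    if out_of <= 0 or not 0 <= keep <= out_of:
        raise ValueError("need out_of > 0 and 0 <= keep <= out_of")
    return [t for i, t in enumerate(transitions) if i % out_of < keep]


base = list(range(8))  # eight transitions of the underlying periodic signal
print(gate_clock(base, keep=1, out_of=4))  # [0, 4]
```

The underlying signal still has eight transitions, but the gated CLK presents only two of them, so the
circuitry it drives operates at a quarter of the original rate.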

[0023] Each of counters 150_0 through 150_N is related to a different type of microprocessor activity, where
that activity is manifested by one or more internal operations of the functional circuitry on microprocessor
110 (otherwise referred to herein as "on-chip" activity). The activity corresponding to each of the counters is
detailed below. However, to better demonstrate the intended scope and operation by way of example, the activity
with respect to counter 150_0 is now described. The count of counter 150_0 is generally related to the
execution of instructions. In general, therefore, and as further modified by statements below, it is desirable
for counter 150_0 to advance each time an instruction is executed. More particularly, corresponding to counter
150_0 are two input signals, which actually are coupled first to a look up mask block 156_0 for reasons
described later. The first of the input signals, INSTTYPE, identifies the type(s) of instruction being executed
by the execution units (i.e., ALUs 142_0, 142_1, FPU 130, load-store units 140_0, 140_1, and microsequencer
148) in a given cycle of operation. Note that the INSTTYPE signal may be one or more bits as chosen for the
particular implementation, where multiple bits are likely if it is desired to distinguish various types of
instruction from one another. Further, this same multiple bit concept may apply to other identifying types of
signal associated with other counters discussed below. The second of the input signals, EXECUTE, is a control
signal which is asserted each time one or more of the execution units are operated in a given clock cycle.

[0024] The operation of the input signals INSTTYPE and EXECUTE in connection with counter 150_0 is as follows.
EXECUTE is asserted when the corresponding functional circuitry (i.e., the execution units) on microprocessor
110 operates during a clock cycle. At this time, and in response to the assertion of EXECUTE, the type of
instruction being executed is submitted by the INSTTYPE signal to look up mask block 156_0. Recall that
microprocessor 110 is preferably a superscalar device and, therefore, by definition it may execute more than
one instruction in a given clock cycle. Thus, one or more instruction types may be executed at one time, and
either in this case or in the case of a non-superscalar microprocessor the identity of the instruction(s) is
provided to look up mask block 156_0 by INSTTYPE. In response, look up mask block 156_0 determines if any of
the one or more instructions are to affect the value of counter 150_0. In other words, look up mask block 156_0
is preferably pre-programmed to mask certain instructions such that they do not affect the advancement of
counter 150_0, while other instructions do indeed affect the advancement of counter 150_0. For example, assume
that look up mask block 156_0 is pre-programmed to mask any instance of an instruction INST1 from affecting the
count of counter 150_0, but is not pre-programmed to mask any instance of an instruction INST2 from affecting
the count of counter 150_0. Assume also in a given cycle of operation of microprocessor 110 that both
instructions INST1 and INST2 are executed. Thus, the INSTTYPE signal inputs values identifying both INST1 and
INST2 to look up mask block 156_0. In response, look up mask block 156_0 masks any effect that INST1 may
otherwise have on the count of counter 150_0; to the contrary, mask block 156_0 does not mask the effect of
INST2. Specifically, look up mask block 156_0 asserts a corresponding advance signal ADV_0 to counter 150_0,
thereby causing the latter to increment (i.e., to advance) once corresponding to the execution of instruction
INST2. Given the connections and operational description of counter 150_0 and its associated signals and
circuitry, one skilled in the art will therefore appreciate that over time (i.e., between periods where counter
150_0 is reset as detailed later), the count of counter 150_0 reflects the recent execution of those
instructions which are not masked by look up mask block 156_0. Having presented the example of counter 150_0,
below are discussed various other microprocessor activities and their corresponding counters and circuitry,
where once again it should be appreciated by one skilled in the art that a given counter will advance in
response to recent incidences of the activity corresponding to that counter.
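
The INST1 / INST2 example can be traced with a small sketch of a look up mask feeding a saturating count. The
function name, the per-cycle tuples, and the limit of 255 are hypothetical; the sketch also assumes one advance
per unmasked instruction in a cycle, which the text does not state explicitly for the multi-instruction case.

```python
def adv_pulses(execute, inst_types, masked_types):
    """Number of ADV assertions for one clock cycle (assumed: one per unmasked instruction)."""
    if not execute:  # EXECUTE not asserted: no execution unit operated this cycle
        return 0
    return sum(1 for t in inst_types if t not in masked_types)


masked = {"INST1"}  # pre-programmed mask: INST1 never advances the counter
limit = 255         # assumed saturating limit
count = 0           # counter reset to zero

# Each tuple is (EXECUTE, instruction types reported by INSTTYPE) for one cycle.
cycles = [(True, ["INST1", "INST2"]), (False, []), (True, ["INST1"])]
for execute, inst_types in cycles:
    count = min(limit, count + adv_pulses(execute, inst_types, masked))

print(count)  # 1
```

Only the execution of INST2 in the first cycle advances the count; both occurrences of INST1 are masked and the
idle cycle contributes nothing.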

[0025] The activity associated with counter 150_1 relates to microprocessor interrupts as instigated by some
type of on-chip action, as opposed to certain types of interrupts which arise due to a signal being asserted,
externally from the microprocessor, to an interrupt pin on the microprocessor. By way of example of the types
of interrupts associated with counter 150_1, some interrupts occur in response to a specific instruction (e.g.,
an interrupt instruction). Still further, other interrupts contemplated within the present scope include what
are known in the art as exceptions, such as a page fault by way of example. In any event, associated with
counter 150_1 in the preferred embodiment is a look up mask block 156_1, which receives two input signals
INTTYPE and INT, and like look up mask block 156_0 provides an advancement signal ADV_1 to its corresponding
counter (i.e., counter 150_1). More particularly, the INTTYPE signal identifies the type of a given interrupt
which is occurring or has just occurred during the relevant clock cycle. The INT signal is asserted to indicate
the occurrence of the on-chip interrupt and to enable the comparison functionality of look up mask block 156_1.
In other words, when INT is asserted, look up mask block 156_1 compares the indication or value of INTTYPE with
any pre-programmed values to determine if the interrupt at issue is to affect the count in counter 150_1. If no
mask is present for the identified interrupt, then counter 150_1 is advanced by asserting ADV_1, thereby
indicating a recent occurrence of the interrupt. To the contrary, if the particular interrupt is to be masked
as determined by look up mask block 156_1, then the ADV_1 signal is not asserted and, therefore, the interrupt
at issue does not affect the count in counter 150_1.

[0026] The activity associated with counter 150_2 relates to either or both of microprocessor context
switching and microprocessor task switching. As known in the art, both types of switching typically are
manifested when certain registers (e.g., general purpose registers) are activated to output to storage the
value(s) relating to the current context or task and then to input value(s) relating to the context or task to
which the microprocessor is to be switched. In other words, the switch occurs where operations are taken by the
microprocessor to quit one context or task to commence another. Typically, task switching relates to a change
in the application program to be run during a given instant in a multi-tasking environment, whereas context
switching is more broadly defined to include any switch of CPU state context such as when initiating an
interrupt service routine. Additionally, associated with counter 150_2 in the preferred embodiment is a look up
mask block 156_2, which receives two input signals CTTYPE and CTSWITCH, and once more look up mask block 156_2
provides an advancement signal ADV_2 to its corresponding counter (i.e., counter 150_2). More particularly, the
CTTYPE signal identifies the type of context or task to which the microprocessor is to switch, and the CTSWITCH
signal is asserted to indicate an occurrence of the context switch or task switch and to enable the comparison
functionality of look up mask block 156_2. In response to the CTSWITCH signal, therefore, look up mask block
156_2 compares the value of the CTTYPE signal with any pre-programmed values to determine if the context switch
or task switch at issue is to affect the count in counter 150_2. Moreover, in a more complex embodiment, the
functionality of look up mask block 156_2 is based on a combination of both the context or task to which the
microprocessor is switching and the context or task from which it is switching, rather than only considering
the context or task to which the microprocessor is switching. In this more complex case, mask block 156_2 is
further coupled to receive some indication of the present context or task. In either case, if no mask is
present for the identified context or task information, then counter 150_2 is advanced by asserting ADV_2,
thereby indicating a recent occurrence of the context or task switch. To the contrary, if the context or task
switch is to be masked as determined by look up mask block 156_2, then the ADV_2 signal is not asserted and,
therefore, the context or task switch at issue does not affect the count in counter 150_2. Lastly, note that
while Figure 2 illustrates a single counter directed to both context and task switching, in an alternative
embodiment separate counters (and corresponding look up mask blocks) could be used, with one directed to
context switching and another directed to task switching.

[0027] The activity associated with counter 150₃ relates to microprocessor cache writes. Recall from Figure 1 that
microprocessor 110 generally includes three cache structures, namely, level 1 cache 116 (which includes a data cache
and an instruction cache), level 2 cache 114, and microcache 118. Given these structures, any one or more of these
on-chip caches provides an activity for possible advancement of counter 150₃. More particularly, associated with
counter 150₃ in the preferred embodiment is a look up mask block 156₃, which receives two input signals CACHEWTYPE
and CWR, and look up mask block 156₃ provides an advancement signal ADV₃ to corresponding counter 150₃. Note that
the CACHEWTYPE signal may in one embodiment identify to which of the caches the current write is directed. As a
further level of consideration giving rise to an alternative embodiment, note further that the CACHEWTYPE signal may
further encode the type of write for a given cache. For example, in the cache write art, there are various types of
writes such as a write allocate or a write causing a victim writeback. In any event, the CWR signal is asserted to
indicate an occurrence of some type of cache write to an on-chip cache, thereby enabling the comparison
functionality of look up mask block 156₃. In response to the CWR signal, look up mask block 156₃ compares the value
of the CACHEWTYPE signal with any pre-programmed values to determine if the write is to affect the count in counter
150₃. If no mask is present for the identified write, then counter 150₃ is advanced by asserting ADV₃, thereby
indicating a recent occurrence of the cache write. To the contrary, if the particular cache write is to be masked as
determined by look up mask block 156₃, then the ADV₃ signal is not asserted and, therefore, the cache write at issue
does not affect the count in counter 150₃.

[0028] The activity associated with counter 150₄ relates to microprocessor descriptor loads. More specifically, in
the 80x86 art, descriptors are data blocks, typically eight bytes long, which describe a segment (e.g., system
segment or application segment) or a gate. During various operations, these descriptors are modified by way of a
load to the descriptor, which is commonly stored in a descriptor register. Accordingly, yet another aspect of the
preferred embodiment is to have such descriptor loads provide the possibility of advancing counter 150₄. Also
included in the preferred embodiment is a look up mask block 156₄ which receives two input signals DESCRTYPE and
DWR, and is associated with counter 150₄. Further, look up mask block 156₄ provides an advancement signal ADV₄ to
corresponding counter 150₄. Thus, the DWR signal is asserted to indicate an occurrence of some type of descriptor
load, thereby enabling the comparison functionality of look up mask block 156₄. In response to the DWR signal, look
up mask block 156₄ compares the value of the DESCRTYPE signal with any pre-programmed values to determine if the
load is to affect the count in counter 150₄. If no mask is present for the identified descriptor load, then counter
150₄ is advanced by asserting ADV₄, thereby indicating a recent occurrence of the descriptor load. To the contrary,
if the particular descriptor load is to be masked as determined by look up mask block 156₄, then the ADV₄ signal is
not asserted and, therefore, the descriptor load at issue does not affect the count in counter 150₄.

[0029] The activity associated with counter 150₅ relates to microprocessor mode switching. Again as known in the
80x86 art, mode switching typically occurs to accommodate protection features as well as compatibility issues for
past and present software. For example, the current state of the art in 80x86 microprocessors permits switches
between modes such as 16 bit protected mode, 32 bit protected mode, V86 mode, and real mode. Further, associated
with counter 150₅ in the preferred embodiment is a look up mask block 156₅, which receives two input signals
NMODETYPE and MSWITCH, where look up mask block 156₅ like others above provides an advancement signal ADV₅ to its
corresponding counter 150₅. The NMODETYPE signal identifies the type of mode to which the microprocessor is to
switch, and the MSWITCH signal is asserted to indicate an occurrence of the mode switch and to enable the comparison
functionality of look up mask block 156₅. In response to the MSWITCH signal, therefore, look up mask block 156₅
compares the value of the NMODETYPE signal with any pre-programmed values to determine if the mode switch at issue
is to affect the count in counter 150₅. Again, in a more complex embodiment, the functionality of look up mask block
156₅ is based on a combination of both the mode to which the microprocessor is switching and the mode from which it
is switching, rather than only considering the mode to which the microprocessor is switching. In this more complex
case, mask block 156₅ is further coupled to receive some indication of the present mode of operation. As an example
of this more complex scenario, therefore, look up mask block 156₅ could be pre-programmed to mask from the count a
switch from 16 bit protected mode to 32 bit protected mode, but not to mask a switch from real mode to 32 bit
protected mode. Thus, in the example both switches are to the same mode (i.e., 32 bit protected mode) yet only one
advances the count due to the combined considerations of the mode of the microprocessor before and after the mode
switch. In any event, if no mask for the current switch information is present, then counter 150₅ is advanced by
asserting ADV₅, thereby indicating a recent occurrence of the mode switch. To the contrary, if the mode switch at
issue is to be masked as determined by look up mask block 156₅, then the ADV₅ signal is not asserted and, therefore,
the mode switch at issue does not affect the count in counter 150₅.
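
The from/to masking described for the mode switch look up mask block can be sketched in Python as follows. This is
only an illustration under stated assumptions, not the circuit itself: the class name `LookUpMaskBlock`, the mode
labels, and the choice to hold the pre-programmed values as a set of (from, to) pairs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LookUpMaskBlock:
    """Hypothetical software model of a look up mask block and its counter."""
    masked: set = field(default_factory=set)  # pre-programmed (from_mode, to_mode) pairs
    count: int = 0                            # stands in for the associated counter

    def on_switch(self, from_mode: str, to_mode: str, mswitch: bool) -> bool:
        """Return True (advancement signal asserted) for an unmasked switch."""
        if not mswitch:
            return False  # comparison is only enabled while MSWITCH is asserted
        if (from_mode, to_mode) in self.masked:
            return False  # masked switch: the count is not affected
        self.count += 1
        return True


# The example from the text: both switches end in 32 bit protected mode,
# but only the one starting from real mode advances the count.
block = LookUpMaskBlock(masked={("prot16", "prot32")})
block.on_switch("prot16", "prot32", mswitch=True)
block.on_switch("real", "prot32", mswitch=True)
```

The simpler embodiment, which considers only the destination mode, would correspond to masking on `to_mode` alone.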

[0030] The activity associated with counter 150₆ relates to microprocessor bus cycles (or "bus transactions")
relating to bus B of Figure 1. It is known in the microprocessor art that different types of bus cycles may occur
along a microprocessor bus. For example, bus cycles may include a memory read, a memory write, a burst memory cycle,
and still others. In this regard, in the preferred embodiment a look up mask block 156₆ receives two input signals
BUSCYCTYPE and BUSCYC associated with counter 150₆. Look up mask block 156₆ provides an advancement signal ADV₆ to
corresponding counter 150₆. The BUSCYC signal is asserted to indicate an occurrence of some type of bus cycle,
thereby enabling the comparison functionality of look up mask block 156₆. In response to the BUSCYC signal, look up
mask block 156₆ compares the type of bus cycle at issue, as indicated by the value of the BUSCYCTYPE signal, with
any pre-programmed values to determine if the cycle is to affect the count in counter 150₆. If no mask is present
for the identified bus cycle, then counter 150₆ is advanced by asserting ADV₆, thereby indicating a recent
occurrence of the given type of bus cycle. To the contrary, if the particular bus cycle is to be masked as
determined by look up mask block 156₆, then the ADV₆ signal is not asserted and, therefore, the current bus cycle
does not affect the count in counter 150₆.

[0031] Having detailed the activities and connections with counters 150₀ through 150₆, note now that counter 150ₙ
is included to demonstrate that one or more other on-chip activities likewise could be used either in lieu of, or in
addition to, any of the activities described above. Thus, in a general sense, this additional activity or activities
is representative of the operation of some type of on-chip functional circuitry of microprocessor 110, and could
affect the counts of corresponding counters. Moreover, this additional activity or activities could be filtered
through the use of a comparable look up mask block, as shown by look up mask block 156ₙ in Figure 2. Consequently,
and in a manner comparable to the many circuits described above, look up mask block 156ₙ provides an advancement
signal ADVₙ to corresponding counter 150ₙ, again by comparing the activity type at issue (as encoded by ACTTYPE) in
response to assertion of the ACTOCCUR signal, which indicates that the activity is occurring in the present clock
cycle of operation of the microprocessor. Once again, therefore, the count in counter 150ₙ may selectively advance
based on whether look up mask block 156ₙ includes a mask for the then occurring activity. Lastly, in addition to the
count inputs to count evaluation function block 152 as described above, the preferred embodiment may include any one
or more of the additionally illustrated counters 150₇ through 150₁₀, each of which is described below.

[0032] The activity associated with counter 150₇ is comparable in some respects to that of counter 150₀, but applies
to the execution of a sequence of instructions rather than an individual instruction. More particularly, note in
connection with counter 150₇ that the INSTTYPE and EXECUTE signals used in connection with counter 150₀ are once
again used as inputs, but here they are first connected to a sequence storage block 158. In the preferred
embodiment, sequence storage block 158 is operable to store a record of the past M instructions in the sequence of
instructions as executed by microprocessor 110. Thus, assuming that the sequence of the instructions passing through
the microprocessor pipeline may be discerned from the INSTTYPE signal, then each time EXECUTE is asserted sequence
storage block 158 updates its list of M instructions to include the most recently executed instructions. In an
out-of-order context, one skilled in the art will appreciate that additional input and/or control may be necessary
to accomplish this function. In all events, given a functionality of sequence storage block 158 to identify the
sequence of the past M instructions which were executed by microprocessor 110, it then outputs an identifier of this
sequence, shown as SEQ in Figure 2. The SEQ identifier indicates that the past M instructions include a given
sequence of instructions, either formed by the M instructions in their entirety or as a subset of those
instructions. The SEQ signal is input to a look up mask block 156₇ which operates in a fashion similar to like
circuitry described above. Thus, in response to the EXECUTE signal, look up mask block 156₇ compares the SEQ value
with any pre-programmed values to determine if the identified instruction sequence is to affect the count in counter
150₇. If no mask is present for the identified instruction sequence, then counter 150₇ is advanced by asserting
ADV₇, thereby indicating a recent occurrence of the instruction sequence. On the other hand, if the particular
instruction sequence is to be masked as determined by look up mask block 156₇, then the ADV₇ signal is not asserted
and, therefore, the current instruction sequence does not affect the count in counter 150₇.
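
As a rough illustration of the sequence storage idea, the Python sketch below keeps the last M instruction types and
checks whether a given sequence appears among them. Several points are assumptions made for the example rather than
details from the text: the class name `SequenceStorage`, the instruction labels, treating a "subset" as a contiguous
run, and watching a single sequence instead of emitting a general SEQ identifier.

```python
from collections import deque


class SequenceStorage:
    """Hypothetical model of a block that records the past M instruction types."""

    def __init__(self, m):
        self.history = deque(maxlen=m)  # oldest entries fall off automatically

    def execute(self, inst_type):
        """Record one executed instruction (called when EXECUTE is asserted)."""
        self.history.append(inst_type)

    def contains(self, pattern):
        """True if pattern occurs as a contiguous run within the stored history."""
        h = list(self.history)
        n = len(pattern)
        return any(h[i:i + n] == list(pattern) for i in range(len(h) - n + 1))


store = SequenceStorage(m=4)
masked_sequences = set()              # nothing is masked in this example
watched = ("load", "fmul", "store")   # assumed sequence of interest
count_7 = 0                           # stands in for the sequence counter

for inst in ["add", "load", "fmul", "store"]:
    store.execute(inst)
    if store.contains(watched) and watched not in masked_sequences:
        count_7 += 1                  # advancement signal asserted
```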

[0033] The remaining counters 150₈ through 150₁₀ represent yet additional on-chip activities which may be separately
counted in the preferred embodiment, but which do not necessarily also work in conjunction with corresponding look
up mask blocks as is preferably the case for the activities discussed above. Briefly examining the activities of
counters 150₈ through 150₁₀, counter 150₈ is advanced in response to an occurrence of a hit in a translation
lookaside buffer ("TLB"). In the preferred embodiment, such a hit may be in either one or both of data μTLB 138 or
instruction μTLB 122, or still additional levels of TLBs as included on the microprocessor integrated circuit.
Recall from earlier, and as known in the art, these devices permit a preliminary access to address translations from
a cache like structure rather than having to perform a possible table walk in memory to translate a virtual address
to a physical address. Thus, a write to one of these TLBs, as manifested by an assertion of the TLBWR signal,
preferably advances counter 150₈. In a similar manner, counter 150₉ is advanced when a descriptor cache register
located on the microprocessor integrated circuit is written. Thus, a write to one of the descriptor cache registers,
as manifested by an assertion of the DESCCACHE signal, preferably advances counter 150₉. Lastly, it is contemplated
that microprocessor 110 includes speculative instruction fetching based on branch prediction, where various
techniques of such speculative activity are known or will be ascertainable by one skilled in the art. Given such
technology, it is known to include a signal, commonly derived from the execution unit of the microprocessor, which
is asserted once it is discovered that a speculative instruction fetch has resulted in a mispredicted branch. In the
preferred embodiment, this signal is either copied directly as, or provokes assertion of, the BRMISPR signal as
directed to counter 150₁₀. Thus, each assertion of the BRMISPR signal preferably advances counter 150₁₀. Also in the
context of mispredicted branches, note further that activities other than the actual instruction execution may
indicate a mispredicted branch, such as a return stack write or a branch target buffer ("BTB") write. Thus, these
additional activities also may be counted by including their occurrences as assertions of the BRMISPR signal.

[0034] Figure 3 illustrates a flowchart of the preferred method 160 of operation of count evaluation function block
152 from Figure 2. At the outset, recall that count evaluation function block 152 is preferably implemented in
software as executed by circuitry within microprocessor 110. Thus, the software may reside in various fashions as
known in the art, such as on external storage from microprocessor 110 or on firmware included within the integrated
circuit (e.g., microcode ROM 146). In any event, the preferred steps of method 160 commence at step 162. Step 162
reads the values of one or more of counters 150₀ through 150ₙ. In this regard, and as further appreciated later,
note that various embodiments within the present inventive scope may be created by including only one or more of
counters 150₀ through 150ₙ and, in such an approach, only the counts in that subset of counters are available for
reading in step 162. Still further, regardless of the number of on-chip activity counts available, step 162 could be
altered by one skilled in the art to selectively read only some of those counters in some instances, while still
others in other instances. As yet another modification, while Figures 1 and 2 are directed to a single
microprocessor with counters reflecting on-chip activity for that circuit, the methodology of Figure 3 may be
further enhanced in a system where other off-chip counters are also available for reading by step 162 to provide an
indication of off-chip activities to be combined with the information provided by one or more of the on-chip
counters. Once the desired counts are read in step 162, method 160 continues to step 163.

[0035] In step 163, count evaluation function block 152 asserts the RESET signal to each of counters 150₀ through
150ₙ (or whatever activity measuring counters are implemented for a given embodiment). As suggested by its name, the
assertion in this manner of the RESET signal causes each counter to reset to its initial value (e.g., zero). Thus,
following this resetting step, each of the affected counters may once again commence counting its corresponding
on-chip activity (or off-chip activity if an off-chip counter is also implemented and assuming it is reset at the
same time). Note therefore that the counter reset preferably occurs immediately after, or in response to, the act of
reading the counter as discussed with respect to step 162, above. In any event, having reset the counters, for a
future iteration of step 162, that is, where it is later once again desired to read the value of the counters, a new
count is available for each reset counter. Next, method 160 continues to step 164.
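
The read-then-reset pairing of steps 162 and 163 can be sketched in Python as below. This is a minimal software
analogy, assuming the counters are held in a dictionary; the function name `read_and_reset` and the counter names
are hypothetical.

```python
def read_and_reset(counters):
    """Snapshot every count (step 162), then reset each counter to zero (step 163)."""
    snapshot = dict(counters)   # copy, so the reset below cannot alter what was read
    for name in counters:
        counters[name] = 0      # assumed initial value of zero
    return snapshot


counters = {"instructions": 800, "cache_writes": 120, "branch_mispredicts": 10}
counts = read_and_reset(counters)
```

Taking the snapshot before the reset mirrors the stated preference that the reset occur immediately after the read,
so each later read sees only activity from the most recent interval.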

[0036] Step 164 calculates a prediction of CPU busyness based on the amount of on-chip activity of microprocessor
110 as manifested by the counts read from step 162. In other words, given the preceding teachings, one skilled in
the art will now appreciate that more advanced counts for the various counters will give rise to a prediction of a
higher incidence of the activity corresponding to those counts. For example, if counter 150₀ is advanced by
incrementing each time any instruction is executed, then a relatively large count in that counter indicates that the
on-chip activity of microprocessor instruction execution has been relatively frequent since the last time counter
150₀ was reset and further gives rise to a prediction of how busy microprocessor 110 will be until the next time the
counter is read (i.e., a prediction of future on-chip busyness). Naturally, this concept applies also to the other
counts as well. Given this information, step 164 is able to consider the various counts to develop a prediction for
how busy microprocessor 110 will be until the next time the relevant counters are reset. In this regard, note that
the particular methodology for considering the various counts, such as which counts to consider, possible weights of
certain counts, and the significance of the relative values of each count, may be selected by one skilled in the
art. For example, if a counter indicates a high incidence of descriptor cache writes then the importance of branch
mispredictions (as evidenced by the count in counter 150₁₀) is preferably less significant, whereas if the incidence
of descriptor cache writes is low then a high count of branch mispredictions is likely very important to a
prediction of future microprocessor busyness. In any event, step 164 performs some type of analytical function on
one or more of the counts (shown as "f{COUNT(S)}" in Figure 3), where that function may be deterministic or adaptive
in nature. Moreover, the actual function implemented in step 164 may depend on various attributes of the actual
microprocessor in which the present embodiment is implemented, and further could be adjusted based on the system
environment in which the microprocessor is used. In any event, note that the actual function and significance
attributed to the individual counts may be selected and modified by a person skilled in the art and preferably when
completed reaches some type of result as reflected by a final calculated number. Once this number is ascertained,
method 160 continues to step 166.

[0037] Step 166 compares the result of the function from step 164 to one or more thresholds, where the threshold(s)
indicates a quantity which if reached represents a prediction that there will be a certain amount of on-chip
activity occurring within microprocessor 110. Note that the threshold(s) may be fixed, or preferably are
programmable such as by having values which may be written to registers located on the microprocessor. In this
regard, note further that where programmable threshold(s) are used, they may be dynamically alterable such that some
functionality, which preferably is a task independent from method 160, updates the threshold value(s) from time to
time so that each time step 166 occurs it may actually involve one or more different threshold values.

[0038] Step 168 determines whether an adjustment is necessary to the CLK signal speed based on the step 164
prediction of microprocessor busyness and the comparison or comparisons of step 166. For example, if a single
threshold is used in step 166, then step 168 may respond based on whether the predicted busyness value is either
above or below the single threshold (with, of course, an appropriate result also being reached if the predicted
busyness value actually equals the single threshold). On the other hand, if multiple thresholds are used in step
166, then based on a comparison of the calculated busyness value to each threshold, a determination may be made as
to the appropriate clock speed for the microprocessor. As a fairly simplistic example, therefore, assume that step
166 implements four thresholds, and assume further that the calculated busyness value is greater than the smaller
two thresholds but less than the larger two thresholds. Thus, it may be concluded that it is desired for
microprocessor 110 to thereafter operate at one-half of its maximum capable clock speed. Given this approach or
still others as will be ascertainable by one skilled in the art, in step 168 count evaluation function block 152
outputs a value of the ADJ signal to clock speed generator 154 so as to either increase, decrease, or not change the
speed of the CLK signal. In response, one skilled in the art will appreciate that the ADJ signal therefore may
adjust the speed of operation of microprocessor 110 based on the predicted microprocessor busyness. Naturally, in
the event that microprocessor operation speed is reduced, then at the same time there is a reduction in the amount
of power consumed by microprocessor 110. Summarizing the effect of method 160 to this point, therefore, step 168
accomplishes a selective power adjustment by microprocessor 110 in response to predicting its anticipated busyness
based on its recent on-chip activity. Thereafter, method 160 continues to step 170.
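
One way to picture steps 164 through 168 together is the Python sketch below. It is an assumption-laden
illustration, not the method itself: the text leaves the analytical function open, so a weighted sum is used here
only as one possible choice, and the weights, threshold values, and clock fractions are invented for the example.

```python
from bisect import bisect_right


def predict_busyness(counts, weights):
    """Step 164 stand-in: a weighted sum of the counts (one assumed choice of function)."""
    return sum(weights.get(name, 0.0) * value for name, value in counts.items())


def select_clock_fraction(busyness, thresholds, fractions):
    """Steps 166 and 168 stand-in: pick the clock fraction for the band busyness falls in.

    thresholds must be ascending, and fractions needs one more entry than thresholds.
    A value equal to a threshold is placed in the higher band (an arbitrary choice).
    """
    if len(fractions) != len(thresholds) + 1:
        raise ValueError("need exactly one more fraction than thresholds")
    return fractions[bisect_right(thresholds, busyness)]


WEIGHTS = {"instructions": 0.25, "cache_writes": 0.5}   # hypothetical weights
THRESHOLDS = [100, 200, 300, 400]                       # hypothetical, ascending
FRACTIONS = [0.125, 0.25, 0.5, 0.75, 1.0]               # fraction of maximum clock speed

busyness = predict_busyness({"instructions": 800, "cache_writes": 100}, WEIGHTS)
adj = select_clock_fraction(busyness, THRESHOLDS, FRACTIONS)
```

With these invented numbers the busyness value lands above the two smaller thresholds and below the two larger
ones, which matches the one-half clock speed case in the four-threshold example.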

[0039] Step 170 represents a wait period before method 160 may return to step 162 to once more read the appropriate
counts and again proceed with the next determination of the on-chip activity in microprocessor 110. Note that step
170 may be accomplished in various manners giving rise to different embodiments. As a first embodiment, step 170 may
be accomplished by waiting for a predetermined amount of time, where that time may be measured by counting clock
cycles. In this regard, note further that it may be preferred to further compensate given the value of the ADJ
signal. Specifically, if it is known from step 168 that the speed of the CLK signal has been reduced, then for a
given amount of time fewer clock transitions will occur than if the CLK signal has not been reduced. Thus, if it is
desired to accomplish the wait state of step 170 based on the passage of an absolute amount of time, then the
required number of clock cycles which must elapse to reach that time will be relatively smaller if the speed of the
CLK signal has been reduced in response to the ADJ signal. As another consideration, however, if the speed of the
CLK signal is reduced by gating a general digital signal as described above, then it is likely that the general
digital signal is also available for counting so as to measure the passage of time. As a second embodiment, step 170
may count clock cycles regardless of the then current speed (as reflected by the CLK signal). As a third embodiment,
step 170 may await a certain event or events, where it is desired that such an event then triggers the repetition of
the method starting at step 162. In this last approach, note that the awaited event may be some type of interrupt
routine, that is, an instruction or event which actually interrupts the operation of microprocessor 110 and in
effect requests that method 160 repeat commencing with step 162. For example, a separate timer could be included
within microprocessor 110 which, when reaching its timeout limit, requests the interrupt which instigates the steps
of method 160. In any event, one skilled in the art will appreciate that method 160 is preferably repeated during
the operation of microprocessor 110 and, during such repetition, monitors the on-chip activity and responds in the
manner set forth above.
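
The clock-cycle compensation in the first embodiment of step 170 is simple arithmetic, sketched below in Python. The
function name `cycles_for_wait`, the microsecond unit, and the example frequencies are assumptions chosen for
illustration.

```python
def cycles_for_wait(wait_us, clk_hz):
    """Number of CLK cycles that span wait_us microseconds at a clock of clk_hz."""
    return wait_us * clk_hz // 1_000_000   # integer arithmetic avoids rounding drift


full_speed = cycles_for_wait(1000, 200_000_000)   # 1 ms at an assumed 200 MHz
half_speed = cycles_for_wait(1000, 100_000_000)   # same 1 ms after CLK is halved
```

For the same absolute wait, halving the clock halves the number of cycles to count, which is the compensation the
paragraph describes.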

[0040] Having presented various approaches within the inventive scope of the present embodiments, note the
considerations with respect to setting the limit of counts for counters 150₀ through 150ₙ. As discussed above, the
count for each of these counters advances in response to each incidence of a given activity (i.e., assuming that
activity is not masked from affecting its corresponding counter). As a first counter consideration, one skilled in
the art should now appreciate the effect of setting the saturatable limit of each counter. Particularly, if a given
counter reflects the incidence of a given activity, and that incidence at some point is indicative of sufficient
busyness of microprocessor 110, then there is no need to allow the count of that counter to exceed this point.
Accordingly, for each activity monitored by a counter, one skilled in the art will be able to ascertain the number
of occurrences for that activity, given the expected period of the wait state of step 170, which represents a
sufficient busyness of microprocessor 110. Thus, the given counter may be configured to saturate at that number.
Thereafter, once the count reaches that number, which is read by step 162, it may be recognized by the function of
step 164 to indicate sufficient microprocessor activity so as not to reduce the clock speed of the microprocessor.
In addition to the above, note further that an alternative embodiment may be achieved using counter limits which are
not intended to saturate; in other words, during a given cycle the count is expected to advance but not to reach any
limit. In this regard, note that some microprocessors may include counters which advance for certain performance
evaluations not directed to power consumption. As such, the count from these counters may be reviewed consistent
with the present teachings to perform a power reduction aspect for the microprocessor.
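
A saturating counter of the kind discussed above can be sketched in a few lines of Python. This is a software
analogy under stated assumptions: the class name `SaturatingCounter` and the limit value are hypothetical.

```python
class SaturatingCounter:
    """Hypothetical counter that stops advancing once it reaches its limit."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def advance(self):
        """Count one incidence of the activity, holding at the saturation limit."""
        if self.count < self.limit:
            self.count += 1

    def reset(self):
        self.count = 0

    @property
    def saturated(self):
        """True once enough incidences have occurred to indicate sufficient busyness."""
        return self.count == self.limit


counter = SaturatingCounter(limit=3)
for _ in range(10):        # ten incidences, but the count holds at the limit
    counter.advance()
```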

[0041] Another consideration with respect to the counters implemented consistent with the present embodiments is
directed to having counters which advance in both directions. More particularly, the above description focuses on
counters which advance only in a single direction (e.g., incrementing from zero toward a limit) between successive
resets of the counter, where advancement occurs in response to successive incidences of a given activity. In an
alternative embodiment, however, it may be desirable to have a counter advance in opposite directions based on
whether its corresponding activity is occurring. In one such approach, a counter advances in a first direction
(e.g., incrementing) for each clock cycle when its corresponding activity occurs, but advances in a second and
opposite direction (e.g., decrementing) for each clock cycle when its corresponding activity does not occur. Indeed,
in this alternative approach, still another aspect contemplated within the inventive scope is to have the counter
advance linearly in one direction and on a logarithmic basis in the other direction, as is sometimes used in the
counter art but in manners unrelated to reducing microprocessor power consumption based on on-chip busyness. In this
alternative approach, the counter preferably advances linearly toward its saturatable limit for each occurrence of
the activity but advances in a logarithmic manner away from the saturatable limit for each non-occurrence of the
activity. Note the effect of such an approach may be desirable for the following reasons. Assume that a given
activity has occurred quite frequently since the last reset of its corresponding counter. Intuitively, therefore,
the counter is going to tend toward a result which suggests not reducing the clock speed of microprocessor 110.
Assume now during that same cycle (i.e., between resets of the counter) that the activity becomes less frequent
toward the end of the cycle. If the counter decrements linearly, then it may soon indicate a count which suggests it
is desirable to reduce the clock speed of microprocessor 110. However, since it may be undesirable to take such an
action prematurely, the alternative of decrementing the counter in a logarithmic fashion reduces the effect of the
non-occurrences relative to the occurrences of the activity and, thus, reduces the likelihood that microprocessor
110 will prematurely receive a reduced CLK signal speed. Given the two different bases for advancing a counter, note
as an alternative that they may be switched such that the counter advances in a logarithmic manner toward its
saturatable limit for each occurrence of the activity but advances linearly away from the saturatable limit for each
non-occurrence of the activity. Lastly, note further in an embodiment where the counters are permitted to advance in
both directions that method 160 may be further modified such that the reset only occurs at a single time. In other
words, if the counters are permitted to show a lack of activity over time by slowly decrementing (or otherwise
advancing in a direction opposite of that which is taken when the activity occurs), then the counters do not
necessarily require multiple resetting events since the decrementing count will approximate the effect of a reset if
a given activity does not occur over a sufficient amount of time. Thus, a one time reset could suffice in this
approach, and the one time reset could occur at system start-up or some other time as chosen by one skilled in the
art.
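
The text does not define exactly what a "logarithmic" decrement means, so the Python sketch below adopts one
possible reading purely as an assumption: the counter steps down only when the current run of idle cycles reaches a
power of two, so the total decay after k consecutive idle cycles grows roughly as log2(k). The class name
`BidirectionalCounter` and the limit are likewise hypothetical.

```python
class BidirectionalCounter:
    """Hypothetical counter: linear increment on activity, slow decay when idle."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self.idle_run = 0   # consecutive clock cycles without the activity

    def tick(self, activity_occurred):
        """Advance the counter for one clock cycle."""
        if activity_occurred:
            self.idle_run = 0
            self.count = min(self.limit, self.count + 1)   # linear, saturating
        else:
            self.idle_run += 1
            # Assumed decay rule: decrement only when idle_run is a power of two.
            if (self.idle_run & (self.idle_run - 1)) == 0:
                self.count = max(0, self.count - 1)


counter = BidirectionalCounter(limit=16)
for _ in range(10):
    counter.tick(True)     # ten busy cycles raise the count to 10
for _ in range(8):
    counter.tick(False)    # eight idle cycles decrement at runs 1, 2, 4, and 8
```

Under this reading, eight idle cycles remove only four counts, where a linear decrement would have removed eight, so
a brief lull late in the interval is less likely to suggest a premature clock reduction.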

[0042] Still another consideration with respect to the counters and consistent with the present embodiments is
directed to mixing the activities associated with a single counter. In other words, the above description of Figure 2
demonstrates counters where each is associated with a different type of activity. Note further, however, that various
on-chip activities could be combined within a single counter where an occurrence of each such activity advances the
counter. As yet another alternative, note that the counter could also include some type of device comparable to
sequence storage block 158 whereby a sequence of activities was stored so that only a given group or sequence of
differing activities advances the corresponding counter. In the sense of a group of activities, this could be accomplished
through a logical AND function of the activities as stored, so that the counter advances only if two (or more) activities
occur between successive resets of the counter. In the sense of a sequence of activities, a more complex logical
structure would be required which advances the counter only if two (or more) activities occur in a given sequential
order, and between successive resets of the counter. Still other combinations will be ascertainable by one skilled in
the art.
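
The group and sequence alternatives above may be sketched as follows, as a software model only. The class names and activity names are hypothetical, and two behaviors are assumptions of this sketch rather than statements of the text: the stored group is cleared each time it completes, and an out-of-order activity restarts the sequence match.

```python
class GroupCounter:
    """Advances only when every activity in the group has been observed."""

    def __init__(self, group):
        self.group = frozenset(group)
        self.seen = set()   # activities stored since the last completion or reset
        self.count = 0

    def observe(self, activity):
        if activity in self.group:
            self.seen.add(activity)
            if self.seen == self.group:  # logical AND of the stored activities
                self.count += 1
                self.seen.clear()        # assumption: the group may complete again

    def reset(self):
        self.seen.clear()
        self.count = 0


class SequenceCounter:
    """Advances only when the activities occur in the given order."""

    def __init__(self, sequence):
        self.sequence = tuple(sequence)
        self.pos = 0        # index of the next expected activity
        self.count = 0

    def observe(self, activity):
        if activity == self.sequence[self.pos]:
            self.pos += 1
            if self.pos == len(self.sequence):
                self.count += 1
                self.pos = 0
        elif activity in self.sequence:
            # Assumption: an out-of-order member restarts the match.
            self.pos = 1 if activity == self.sequence[0] else 0

    def reset(self):
        self.pos = 0
        self.count = 0
```

The group form needs only a record of which activities have been seen, whereas the sequence form must also track position, which reflects the more complex logical structure noted above.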

[0043] From the above, it may be appreciated that the above embodiments provide various structures and
methodology for improving the performance of a microprocessor. For example, an integrated circuit implementing the
present teachings may implement more aggressive power reduction techniques based on the myriad information available to
the system from its various counters. As another example, power reduction is directed directly to the integrated circuit
rather than to some peripheral, and in many instances this is highly appropriate where the integrated circuit is itself a
considerable user of system power. Still other benefits will be appreciated by one skilled in the art. Still further, as
another benefit note that while the present embodiments have been described in detail, various substitutions,
modifications or alterations could be made to the descriptions set forth above without departing from the inventive scope.
Indeed, numerous different examples have been provided in the preceding text. Another example by way of illustration
is to eliminate various of the look up mask blocks described earlier, such that each incidence of a given activity advances
the corresponding counter. As another example, some of the activities described earlier may be further subdivided into
narrower groups of activities, each having its own corresponding counter. For example, in the context of instruction
execution, a first instruction counter could advance for an execution of any floating point instruction, a second instruction
counter could advance for an execution of any privileged instruction, a third instruction counter could advance for an
execution of any memory access instruction, and so forth. In this regard, the increased number of counters permits
the ability to further distinguish various activities and, therefore, to further refine the response (i.e., through the function
of step 164) to those activities. As still another example, note that various steps in Figure 3 may be reordered and,
indeed, one skilled in the art may include additional steps as well. As still another example of the flexibility of the
inventive scope, it should be noted that the microprocessor of Figure 1 is provided only by way of example, and that
the present teachings apply to other microprocessors as well. Thus, these examples as well as additional ones
ascertainable by one skilled in the art further demonstrate the present inventive scope, which is defined by the following
claims.
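
As a non-limiting illustration of the subdivided instruction counters mentioned above, the following sketch keeps one count per instruction class and combines the counts into a single busyness figure. The class labels, the function names, and the weighted sum are assumptions introduced here; the text does not specify how step 164 combines the counts.

```python
# Hypothetical instruction classes; in hardware the decoder would supply these.
CLASSES = ("floating_point", "privileged", "memory_access")


def count_by_class(executed):
    """Advance one counter per instruction class for each executed instruction."""
    counters = {c: 0 for c in CLASSES}
    for instr_class in executed:
        if instr_class in counters:   # instructions of other classes are ignored
            counters[instr_class] += 1
    return counters


def predict_busyness(counters, weights):
    """Combine the per-class counts; a weighted sum is only one possible choice."""
    return sum(weights[c] * n for c, n in counters.items())


counts = count_by_class(["floating_point", "memory_access", "memory_access", "other"])
weights = {"floating_point": 4, "privileged": 1, "memory_access": 2}  # illustrative
print(counts, predict_busyness(counts, weights))
```

Separate counters allow the weighting to differ by class, which is one way the response could be refined as the number of counters grows.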



Claims

1. A method of operating a microprocessor having at least one counter located on the microprocessor, the method 

comprising the steps of: 

operating functional circuitry on the microprocessor over a plurality of clock cycles, wherein the operation may 
cause an on-chip activity to occur during the plurality of clock cycles; 

advancing a count in the at least one counter in response to each incidence of the on-chip activity;
after the advancing step, the steps of:

predicting the busyness of the microprocessor in response to the count in the at least one counter; and
selectively adjusting power consumption of the microprocessor in response to a comparison of the
predicted busyness with a threshold.

2. The method of Claim 1:

wherein the at least one counter consists of a plurality of counters;

wherein the step of operating functional circuitry on the microprocessor over a plurality of clock cycles may
cause a plurality of on-chip activities to occur during the plurality of clock cycles;

and further comprising the step of, for each of the plurality of counters, advancing a count in the counter in
response to an incidence of a corresponding one of the plurality of activities;

and wherein the step of predicting the busyness of the microprocessor further comprises predicting the
busyness of the microprocessor in response to the counts in each of the plurality of counters.

3. The method according to any preceding Claim: 

and further comprising the step of generating a type signal indicating a type of the on-chip activity; and
wherein the step of advancing a count in the at least one counter in response to each incidence of the on-chip
activity comprises:

receiving the type signal into a mask circuit;

determining whether the type signal corresponds to a type of on-chip activity to be masked from affecting the
count in the at least one counter; and

in response to the determining step determining that the type signal does not correspond to a type of on-chip
activity to be masked from affecting the count in the at least one counter, advancing the count in the at least
one counter in response to each incidence of the on-chip activity; and

in response to the determining step determining that the type signal corresponds to a type of on-chip activity
to be masked from affecting the count in the at least one counter, the step of not advancing the count in the
at least one counter.

4. The method according to any preceding Claim and further comprising the step of resetting the count in the at least 
one counter prior to the step of advancing a count in the at least one counter. 

5. The method according to any preceding Claim and further comprising the step of resetting the count in the at least
one counter after the step of selectively adjusting power consumption of the microprocessor.

6. The method according to any preceding Claim and further comprising, prior to the step of selectively adjusting
power consumption of the microprocessor, the steps of:

reading the count in the at least one counter, wherein the step of predicting the busyness is responsive to the
reading step; and

responsive to the reading step, resetting the count in the at least one counter.

7. The method according to any preceding Claim: 

wherein the step of advancing a count comprises advancing the count in a first direction; and

further comprising the step of advancing the count in a second direction opposite the first direction during the 
plurality of clock cycles. 

8. The method according to any preceding Claim wherein the step of selectively adjusting power consumption of the
microprocessor in response to a comparison of the predicted busyness with a threshold comprises reducing power
consumption of the microprocessor if the predicted busyness is below the threshold.

9. The method of Claim 1:

and further comprising receiving into the microprocessor a count from a counter of circuitry activity external
from the microprocessor; and

wherein the step of predicting the busyness of the microprocessor in response to the count in the counter 
comprises predicting the busyness of the microprocessor in response to the count in the counter in combination 
with the count from a counter of circuitry activity external from the microprocessor. 

10. The method of Claim 1: 

wherein the at least one counter consists of a plurality of counters; 

wherein the step of operating functional circuitry on the microprocessor over a plurality of clock cycles may 
cause a plurality of on-chip activities to occur during the plurality of clock cycles;

and further comprising the steps of:

for each of the plurality of counters, advancing a count in the counter in response to an incidence of a
corresponding one of the plurality of activities; and
receiving into the microprocessor a count from a counter of circuitry activity external from the
microprocessor; and

wherein the step of predicting the busyness of the microprocessor further comprises predicting the busyness
of the microprocessor in response to the counts in each of the plurality of counters in combination with the
count from the counter of circuit activity external from the microprocessor.
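
For illustration only, and without limiting the claims, the overall flow of counting, masking, predicting, and comparing with a threshold may be modeled in software as below. The function names, the activity type labels, and the use of a simple sum to fold in an external count are assumptions of this sketch, not features recited above.

```python
def advance(count, activity_type, masked_types):
    """Masked activity types leave the count unchanged; others advance it by one."""
    return count if activity_type in masked_types else count + 1


def manage_power(activity_types, masked_types, threshold, external_count=0):
    """Return (predicted_busyness, reduce_power) for one window of clock cycles."""
    count = 0                              # count reset before advancing
    for activity_type in activity_types:   # one entry per incidence of activity
        count = advance(count, activity_type, masked_types)
    busyness = count + external_count      # assumption: prediction is a plain sum
    return busyness, busyness < threshold  # reduce power when below the threshold


# Hypothetical window: three activities, with "nop" masked from the count.
print(manage_power(["load", "nop", "store"], {"nop"}, threshold=3))
```

A real implementation would perform these steps in hardware or low-level firmware; the model is meant only to make the ordering of the steps concrete.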